\htmlhr
% Reinstate when lists are supported:
% \chapterAndLabel{Index Checker for sequence bounds (arrays, strings, lists)}{index-checker}
\chapterAndLabel{Index Checker for sequence bounds (arrays and strings)}{index-checker}

The Index Checker warns about potentially out-of-bounds accesses to sequence
data structures, such as arrays
% , lists,
and strings.

The Index Checker prevents \<IndexOutOfBoundsException>s that result from
an index expression that might be negative or might be equal to or larger
than the sequence's length.
It also prevents \<NegativeArraySizeException>s that result from a negative
array dimension in an array creation expression.
(A caveat: the Index Checker does not check for arithmetic overflow. If
an expression overflows, the Index Checker might fail to warn about a
possible exception.  This is unlikely to be a problem in practice unless
you have an array whose length is \<Integer.MAX\_VALUE>.)

% Here's a pathological example of overflow leading to unsoundness:
%
% public class IndexOverflow {
%     public static void main(String[] args) {
%         @Positive int x = 1073741825; // 2 ^ 30 + 1
%         @Positive int x2 = x + x; // 2 ^ 31 + 2 == - 2 ^ 31 + 2
%         int[] a = new int[0];
%         if (x2 < a.length) {
%             a[x2] = 42;
%         }
%     }
% }

The programmer can write annotations that indicate which expressions are
indices for which sequences.  The Index Checker prohibits any operation that
may violate these properties, and the Index Checker takes advantage of
these properties when verifying indexing operations.
%
Typically, a programmer writes few annotations, because the Index Checker
infers properties of indexes from
the code around them. For example, it will infer that \<x> is positive
within the \<then> block of an \code{if (x > 0)} statement.
The programmer does need to write field types and method pre-conditions or post-conditions. For instance,
if a method's formal parameter is used as an index for
\<myArray>, the programmer might need to
write an \refqualclasswithparams{checker/index/qual}{IndexFor}{"myArray"}
annotation on the formal parameter's types.

The Index Checker checks fixed-size data structures, whose size is never
changed after creation.  A fixed-size data structure has no \<add> or
\<remove> operation.  Examples are strings and arrays, and you can add
support for other fixed-size data structures (see
Section~\ref{index-annotating-fixed-size}).

To run the Index Checker, run either of these commands:

\begin{alltt}
  javac -processor index \emph{MyJavaFile}.java
  javac -processor org.checkerframework.checker.index.IndexChecker \emph{MyJavaFile}.java
\end{alltt}

Recall that in Java, type annotations are written before the type;
in particular,
array annotations appear immediately before ``\<[]>''.
Here is how to declare a length-9 array of positive integers:

\begin{Verbatim}
  @Positive int @ArrayLen(9) []
\end{Verbatim}

Multi-dimensional arrays are similar.
Here is how to declare a length-2 array of length-4 arrays:

\begin{Verbatim}
  String @ArrayLen(2) [] @ArrayLen(4) []
\end{Verbatim}


\sectionAndLabel{Index Checker structure and annotations}{index-annotations}

Internally, the Index Checker computes information about integers that
might be indices:
\begin{itemize}
\item
  the lower bound on an integer, such as whether it is known to be positive
  (Section~\ref{index-lowerbound})
\item
  the upper bound on an integer, such as whether it is less than the length
  of a given sequence (Section~\ref{index-upperbound})
\item
  whether an integer came from calling the JDK's binary search routine on
  an array (Section~\ref{index-searchindex})
\item
  whether an integer came from calling a string search routine
  (Section~\ref{index-substringindex})
\end{itemize}

\noindent
and about sequence lengths:
\begin{itemize}
\item
  the minimum length of a sequence, such ``\<myArray> contains at least 3
  elements'' (Section~\ref{index-minlen})
\item
  whether two sequences have the same length (Section~\ref{index-samelen})
\end{itemize}

The Index Checker checks of all these properties at once, but
this manual discusses each type system in a different section.
There are some annotations that are shorthand for writing multiple
annotations, each from a different type system:

\begin{description}
\item[\refqualclasswithparams{checker/index/qual}{IndexFor}{String[] names}]
  The value is a valid index for the named sequences.  For example, the
  \sunjavadoc{java.base/java/lang/String.html\#charAt(int)}{String.charAt(int)}
  method is declared as

  \begin{Verbatim}
  class String {
    char charAt(@IndexFor("this") int index) { ... }
  }
  \end{Verbatim}

  More generally, a variable
  declared as \<@IndexFor("someArray") int i> has type
  \<@IndexFor("someArray") int> and its run-time value is guaranteed to be
  non-negative and less than the length of \<someArray>.  You could also
  express this as
  \<\refqualclass{checker/index/qual}{NonNegative}
  \refqualclasswithparams{checker/index/qual}{LTLengthOf}{"someArray"}
  int i>,
  but \<@IndexFor("someArray") int i> is more concise.

 \item[\refqualclasswithparams{checker/index/qual}{IndexOrHigh}{String[] names}]
   The value is non-negative and is less than or equal to the length of
   each named sequence.  This type combines
  \refqualclass{checker/index/qual}{NonNegative} and
  \refqualclass{checker/index/qual}{LTEqLengthOf}.

  For example, the
  \sunjavadoc{java.base/java/util/Arrays.html\#fill(java.lang.Object\%5B\%5D,int,int,java.lang.Object)}{Arrays.fill}
   method is declared as

  \begin{mysmall}
  \begin{Verbatim}
  class Arrays {
    void fill(Object[] a, @IndexFor("#1") int fromIndex, @IndexOrHigh("#1") int toIndex, Object val)
  }
  \end{Verbatim}
  \end{mysmall}

 \item[\refqualclasswithparams{checker/index/qual}{LengthOf}{String[] names}]
   The value is exactly equal to the length of the named
   sequences. In the implementation, this type aliases
   \refqualclass{checker/index/qual}{IndexOrHigh}, so writing it
   only adds documentation (although future versions of the Index Checker
   may use it to improve precision).

 \item[\refqualclasswithparams{checker/index/qual}{IndexOrLow}{String[] names}]
   The value is -1 or is a valid index for
   each named sequence.  This type combines
  \refqualclass{checker/index/qual}{GTENegativeOne} and
  \refqualclass{checker/index/qual}{LTLengthOf}.

%  Example commented out; IndexOrLow is not sound for indexOf, because "".indexOf("") returns 0
%
%  For example, the
%  \sunjavadoc{java.base/java/lang/String.html\#indexOf(java.lang.String)}{String.indexOf(String)}
%  method is declared as
%
%  \begin{Verbatim}
%  class String {
%    @IndexOrLow("this") int indexOf(String str) { ... }
%  }
%  \end{Verbatim}

 \item[\refqualclass{checker/index/qual}{PolyIndex}]
   indicates qualifier polymorphism.  This type combines
   \refqualclass{checker/index/qual}{PolyLowerBound} and
   \refqualclass{checker/index/qual}{PolyUpperBound}.
   For a description of qualifier polymorphism, see
   Section~\ref{method-qualifier-polymorphism}.

 \item[\refqualclass{checker/index/qual}{PolyLength}]
   is a special polymorphic qualifier that combines
   \refqualclass{checker/index/qual}{PolySameLen} and
   \refqualclass{common/value/qual}{PolyValue} from the
   Constant Value Checker (see \chapterpageref{constant-value-checker}).
   \refqualclass{checker/index/qual}{PolyLength} exists
   as a shorthand for these two annotations, since
   they often appear together.

\end{description}

\sectionAndLabel{Lower bounds}{index-lowerbound}

The Index Checker issues an error when
a sequence is indexed by an integer that might be negative.
The Lower Bound Checker uses a type system (Figure~\ref{fig-index-int-types}) with the following
qualifiers:

\begin{description}
\item[\refqualclass{checker/index/qual}{Positive}]
  The value is 1 or greater, so it is not too low to be used as an index.
  Note that this annotation is trusted by the Constant Value Checker,
  so if the Constant Value Checker is run on code containing this annotation,
  the Lower Bound Checker must be run on the same code in order to
  guarantee soundness.
\item[\refqualclass{checker/index/qual}{NonNegative}]
  The value is 0 or greater, so it is not too low to be used as an index.
\item[\refqualclass{checker/index/qual}{GTENegativeOne}]
  The value is -1 or greater.
  It may not be used as an index for a sequence, because it might be too low.
  (``\<GTE>'' stands for ``Greater Than or Equal to''.)
\item[\refqualclass{checker/index/qual}{PolyLowerBound}]
  indicates qualifier polymorphism.
  For a description of qualifier polymorphism, see
  Section~\ref{method-qualifier-polymorphism}.
\item[\refqualclass{checker/index/qual}{LowerBoundUnknown}]
  There is no information about the value.
  It may not be used as an index for a sequence, because it might be too low.
\item[\refqualclass{checker/index/qual}{LowerBoundBottom}]
    The value cannot take on any integral types. The bottom type, which
    should not need to be written by the programmer.
\end{description}

\begin{figure}
\begin{center}
  \hfill
  \includeimagenocentering{lowerbound}{5cm}
  ~~~~\hfill~~~~
  \includeimagenocentering{upperbound}{7cm}
  \hfill
\end{center}
  \caption{The two type hierarchies for integer types used by the Index
    Checker.  On the left is a type system for lower bounds.  On the right
    is a type system for upper bounds.  Qualifiers written in gray should
    never be written in source code; they are used internally by the type
    system.
    % Using "\\" works for some but not all installations of LaTeX.
    \newline
    In the Upper Bound type system, subtyping rules depend on both the
    array name (\<"myArray">, in the figure) and on the offset (which is 0,
    the default, in the figure).  Another qualifier is
    \refqualclass{checker/index/qual}{UpperBoundLiteral}, whose subtyping
    relationships depend on its argument and on offsets for other qualifiers.
 }
  \label{fig-index-int-types}
\end{figure}


\sectionAndLabel{Upper bounds}{index-upperbound}

The Index Checker issues an error when a sequence index might be
too high. To do this, it maintains information about which expressions are
safe indices for which sequences.
The length of a sequence is \code{arr.length} for arrays and
\code{str.length()} for strings.
It uses a type system (Figure~\ref{fig-index-int-types}) with the following
qualifiers:

It issues an error when a sequence \code{arr}
is indexed by an integer that is not of type \code{@LTLengthOf("arr")}
or \code{@LTOMLengthOf("arr")}.

\begin{description}

\item[\refqualclasswithparams{checker/index/qual}{LTLengthOf}{String[] names, String[] offset}]
  An expression with this type
  has value less than the length of each sequence listed in \<names>.
  The expression may be used as an index into any of those sequences,
  if it is non-negative.
  For example, an expression of type \code{@LTLengthOf("a") int} might be
  used as an index to \<a>.
  The type \code{@LTLengthOf(\{"a", "b"\})} is a subtype of both
  \code{@LTLengthOf("a")} and \code{@LTLengthOf("b")}.
  (``\<LT>'' stands for ``Less Than''.)

  \<@LTLengthOf> takes an optional \<offset> element, meaning that the
  annotated expression plus the offset is less than the length of the given
  sequence.  For example, suppose expression \<e> has type \<@LTLengthOf(value
  = \{"a", "b"\}, offset = \{"-1", "x"\})>. Then \<e - 1> is less than
  \<a.length>, and \<e + x> is less than \<b.length>.  This helps to make
  the checker more precise.  Programmers rarely need to write the \<offset>
  element.

\item[\refqualclasswithparams{checker/index/qual}{LTEqLengthOf}{String[] names}]
  An expression with this type
  has value less than or equal to the length of each sequence listed in \<names>.
  It may not be used as an index for these sequences, because it might be too high.
  \code{@LTEqLengthOf(\{"a", "b"\})} is a subtype of both
  \code{@LTEqLengthOf("a")} and \code{@LTEqLengthOf("b")}.
  (``\<LTEq>'' stands for ``Less Than or Equal to''.)

  \<@LTEqLengthOf(\{"a"\})> = \<@LTLengthOf(value=\{"a"\}, offset=-1)>, and \\
  \<@LTEqLengthOf(value=\{"a"\}, offset=x)> = \<@LTLengthOf(value=\{"a"\},
  offset=x-1)> for any x.

\item[\refqualclasswithparams{checker/index/qual}{LTOMLengthOf}{String[] names}]
  An expression with this type
  has value at least 2 less than the length of each sequence listed in \<names>.
  It may always used as an index for a sequence listed in \<names>, if it is
  non-negative.

  This type exists to allow the checker to infer the safety of loops of
  the form:
\begin{Verbatim}
  for (int i = 0; i < array.length - 1; ++i) {
    arr[i] = arr[i+1];
  }
\end{Verbatim}
  This annotation should rarely (if ever) be written by the programmer; usually
  \refqualclasswithparams{checker/index/qual}{LTLengthOf}{String[] names}
  should be written instead.
  \code{@LTOMLengthOf(\{"a", "b"\})} is a subtype of both
  \code{@LTOMLengthOf("a")} and \code{@LTOMLengthOf("b")}.
  (``\<LTOM>'' stands for ``Less Than One Minus'', because another way of
  saying ``at least 2 less than \<a.length>'' is ``less than \<a.length-1>''.)

  \<@LTOMLengthOf(\{"a"\})> = \<@LTLengthOf(value=\{"a"\}, offset=1)>, and \\
  \<@LTOMLengthOf(value=\{"a"\}, offset=x)> = \<@LTLengthOf(value=\{"a"\},
  offset=x+1)> for any x.

\item[\refqualclass{checker/index/qual}{UpperBoundLiteral}]
  represents a constant value, typically a literal written in source code.
  Its subtyping relationship is:
  \<@UpperBoundLiteral(lit)> <: \<LTLengthOf(value="myArray", offset=off)>
  if \<lit>+\<offset> $\le$ -1.

\item[\refqualclass{checker/index/qual}{PolyUpperBound}]
  indicates qualifier polymorphism.
  For a description of qualifier polymorphism, see
  Section~\ref{method-qualifier-polymorphism}.

\item[\refqualclass{checker/index/qual}{UpperBoundUnknown}]
  There is no information about the upper bound on the value of an expression with this type.
  It may not be used as an index for a sequence, because it might be too high.
  This type is the top type, and should never need to be written by the
  programmer.

\item[\refqualclass{checker/index/qual}{UpperBoundBottom}]
  This is the bottom type for the upper bound type system. It should
  never need to be written by the programmer.

\end{description}

The following method annotations can be used to establish a method postcondition
that ensures that a certain expression is a valid index for a sequence:

\begin{description}
\item[\refqualclasswithparams{checker/index/qual}{EnsuresLTLengthOf}{String[] value, String[] targetValue, String[] offset}]
  When the method with this annotation returns, the expression (or all the expressions) given in the \code{value} element
  is less than the length of the given sequences with the given offsets. More precisely, the expression
  has the \code{@LTLengthOf} qualifier with the \code{value} and \code{offset} arguments
  taken from the \code{targetValue} and \code{offset} elements of this annotation.
\item[\refqualclasswithparams{checker/index/qual}{EnsuresLTLengthOfIf}{String[] expression, boolean result, String[] targetValue, String[] offset}]
  If the method with this annotation returns the given boolean value,
  then the given expression (or all the given expressions)
  is less than the length of the given sequences with the given offsets.
\end{description}

There is one declaration annotation that indicates the relationship between
two sequences:

\begin{description}
\item[\refqualclasswithparams{checker/index/qual}{HasSubsequence}{String[]
    value, String[] from, String[] to}]
  indicates that a subsequence (from \code{from} to \code{to}) of the
  annotated sequence is equal to some other sequence, named by
  \code{value}).

For example, to indicate that \<shorter> is a subsequence of \<longer>:

\begin{Verbatim}
  int start;
  int end;
  int[] shorter;
  @HasSubsequence(value="shorter", from="this.start", to="this.end")
  int[] longer;
\end{Verbatim}

Thus, a valid index into \<shorter> is also a valid index (between
\code{start} and \code{end-1} inclusive) into \<longer>.  More generally,
if \code{x} is \code{@IndexFor("shorter")} in the example above, then
\code{start + x} is \code{@IndexFor("longer")}. If \code{y} is
\code{@IndexFor("longer")} and \code{@LessThan("end")}, then \code{y -
  start} is \code{@IndexFor("shorter")}. Finally, \code{end - start} is
\code{@IndexOrHigh("shorter")}.

This annotation is in part checked and in part trusted.  When an array is
assigned to \code{longer}, three facts are checked: that \code{start} is
non-negative, that \code{start} is less than or equal to \code{end}, and
that \code{end} is less than or equal to the length of \code{longer}.  This
ensures that the indices are valid. The programmer must manually verify
that the value of \code{shorter} equals the subsequence that they describe.
\end{description}


\sectionAndLabel{Sequence minimum lengths}{index-minlen}

The Index Checker estimates, for each sequence expression, how long its value
might be at run time by computing a minimum length that
the sequence is guaranteed to have.  This enables the Index Checker to
verify indices that are compile-time constants.  For example, this code:

\begin{Verbatim}
  String getThirdElement(String[] arr) {
    return arr[2];
  }
\end{Verbatim}

\noindent
is legal if \<arr> has at least three elements, which can be indicated
in this way:

\begin{Verbatim}
  String getThirdElement(String @MinLen(3) [] arr) {
    return arr[2];
  }
\end{Verbatim}

When the index is not a compile-time constant, as in \<arr[i]>, then the
Index Checker depends not on a \<@MinLen> annotation but on \<i> being
annotated as
\refqualclasswithparams{checker/index/qual}{LTLengthOf}{"arr"}.

The MinLen type qualifier is implemented in practice by the Constant Value Checker,
using \<@ArrayLenRange> annotations (see \chapterpageref{constant-value-checker}).
This means that errors related to the minimum lengths of arrays must be suppressed using
the "value" argument to \<@SuppressWarnings>.
\refqualclass{common/value/qual}{ArrayLenRange} and \refqualclass{common/value/qual}{ArrayLen}
annotations can also be used to establish the minimum length of a sequence, if a
more precise estimate of length is known. For example,
if \<arr> is known to have exactly three elements:

\begin{Verbatim}
  String getThirdElement(String @ArrayLen(3) [] arr) {
    return arr[2];
  }
\end{Verbatim}

The following type qualifiers (from \chapterpageref{constant-value-checker})
can establish the minimum length of a sequence:

\begin{description}
\item[\refqualclasswithparams{common/value/qual}{MinLen}{int value}]
  The value of an expression of this type is a sequence with at least
  \code{value} elements.  The default annotation is
  \code{@MinLen(0)}, and it may be applied to non-sequences.
  \code{@MinLen($x$)} is a subtype of \code{@MinLen($x-1$)}.
  An \code{@MinLen} annotation is treated internally as an
  \refqualclass{common/value/qual}{ArrayLenRange} with only its
  \code{from} field filled.
\item[\refqualclasswithparams{common/value/qual}{ArrayLen}{int[] value}]
  The value of an expression of this type is a sequence whose
  length is exactly one of the integers listed in its argument.
  The argument can contain at most ten integers; larger collections of
  integers are converted to \refqualclass{common/value/qual}{ArrayLenRange}
  annotations. The minimum length of a sequence with this annotation
  is the smallest element of the argument.
\item[\refqualclasswithparams{common/value/qual}{ArrayLenRange}{int from, int to}]
  The value of an expression of this type is a sequence whose
  length is bounded by its arguments, inclusive.
  The minimum length of a sequence with this annotation is its \<from> argument.
\end{description}

\begin{figure}
\begin{center}
  \hfill
  \includeimage{samelen}{5cm}
  \hfill
\end{center}
  \caption{The type hierarchy for arrays of equal length ("a" and "b" are
    assumed to be in-scope sequences).  Qualifiers
    written in gray should never be written in source code; they are used
    internally by the type system.}
  \label{fig-index-array-types}
\end{figure}

The following method annotation can be used to establish a method postcondition
that ensures that a certain sequence has a minimum length:

\begin{description}
\item[\refqualclasswithparams{common/value/qual}{EnsuresMinLenIf}{String[] expression, boolean result, int targetValue}]
  If the method with this annotation returns the given boolean value,
  then the given expression (or all the given expressions) is a sequence
  with at least \code{targetValue} elements.
\end{description}

\sectionAndLabel{Sequences of the same length}{index-samelen}

The Index Checker determines whether two or more sequences have the same length.
This enables it to verify that all the indexing operations are safe in code
like the following:

\begin{Verbatim}
  boolean lessThan(double[] arr1, double @SameLen("#1") [] arr2) {
    for (int i = 0; i < arr1.length; i++) {
      if (arr1[i] < arr2[i]) {
        return true;
      } else if (arr1[i] > arr2[i]) {
        return false;
      }
    }
    return false;
  }
\end{Verbatim}

When needed, you can specify which sequences have the same length using the following type qualifiers (Figure~\ref{fig-index-array-types}):

\begin{description}
\item[\refqualclasswithparams{checker/index/qual}{SameLen}{String[] names}]
  An expression with this type represents a sequence that has the
  same length as the other sequences named in \<names>. In general,
  \code{@SameLen} types that have non-intersecting sets of names
  are \textit{not} subtypes of each other. However, if at least one
  sequence is named by both types, the types are actually the same,
  because all the named sequences must have the same length.
\item[\refqualclass{checker/index/qual}{PolySameLen}]
  indicates qualifier polymorphism.
  For a description of qualifier polymorphism, see
  Section~\ref{method-qualifier-polymorphism}.
\item[\refqualclass{checker/index/qual}{SameLenUnknown}]
  No information is known about which other sequences have the same length
  as this one.
  This is the top type, and programmers should never need to write it.
\item[\refqualclass{checker/index/qual}{SameLenBottom}]
  This is the bottom type, and programmers should rarely need to write it.
  \code{null} has this type.
\end{description}


\sectionAndLabel{Binary search indices}{index-searchindex}

The JDK's
\sunjavadoc{java.base/java/util/Arrays.html\#binarySearch(java.lang.Object\%5B\%5D,java.lang.Object)}{Arrays.binarySearch}
method returns either where the value was found, or a negative value
indicating where the value could be inserted.  The Search Index Checker
represents this concept.

\begin{figure}
\begin{center}
  \hfill
  \includeimage{searchindex}{7cm}
  \hfill
\end{center}
  \caption{The type hierarchy for the Index Checker's internal type system
  that captures information about the results of calls to
  \sunjavadoc{java.base/java/util/Arrays.html\#binarySearch(java.lang.Object\%5B\%5D,java.lang.Object)}{Arrays.binarySearch}.}
  \label{fig-index-searchindex}
\end{figure}

The Search Index Checker's type hierarchy (Figure~\ref{fig-index-searchindex}) has four type qualifiers:
\begin{description}
\item[\refqualclasswithparams{checker/index/qual}{SearchIndexFor}{String[] names}]
  An expression with this type represents an integer that could have been
  produced by calling
  \sunjavadoc{java.base/java/util/Arrays.html\#binarySearch(java.lang.Object\%5B\%5D,java.lang.Object)}{Arrays.binarySearch}:
  for each array \<a> specified in the annotation, the annotated integer is
  between \<-a.length-1> and \<a.length-1>, inclusive
\item[\refqualclasswithparams{checker/index/qual}{NegativeIndexFor}{String[] names}]
  An expression with this type represents a ``negative index'' that is
  between \<a.length-1> and \<-1>, inclusive; that is, a value that is both
  a \<@SearchIndex> and is negative.  Applying the bitwise complement
  operator (\verb|~|) to an expression of this type produces an expression
  of type \refqualclass{checker/index/qual}{IndexOrHigh}.
\item[\refqualclass{checker/index/qual}{SearchIndexBottom}]
  This is the bottom type, and programmers should rarely need to write it.
\item[\refqualclass{checker/index/qual}{SearchIndexUnknown}]
  No information is known about whether this integer is a search index.
  This is the top type, and programmers should rarely need to write it.
\end{description}


\sectionAndLabel{Substring indices}{index-substringindex}

The methods
\sunjavadoc{java.base/java/lang/String.html\#indexOf(java.lang.String)}{String.indexOf}
and
\sunjavadoc{java.base/java/lang/String.html\#lastIndexOf(java.lang.String)}{String.lastIndexOf}
return an index of a given substring within a given string, or -1 if no
such substring exists.  The index \<i> returned from
\<receiver.indexOf(substring)> satisfies the following property, which is
stated here in three equivalent ways:
\begin{Verbatim}
 i == -1 || ( i >= 0       && i <= receiver.length() - substring.length()                  )
 i == -1 || ( @NonNegative && @LTLengthOf(value="receiver", offset="substring.length()-1") )
 @SubstringIndexFor(value="receiver", offset="substring.length()-1")
\end{Verbatim}

% This new annotation is similar to \<@LTLengthOf\allowbreak(value =
% "receiver", offset = "substring.length()-1")>, but explicitly allows the
% index to be -1 even if the upper bound would not allow it because of the
% offset.  The Upper Bound Checker can infer the corresponding
% \<@LTLengthOf> annotation for expressions that have a
% \<@SubstringIndexFor> annotation and at the same time are known to be
% non-negative (according to the Lower Bound Checker).

The return type of methods \sunjavadoc{java.base/java/lang/String.html\#indexOf(java.lang.String)}{String.indexOf}
and \sunjavadoc{java.base/java/lang/String.html\#lastIndexOf(java.lang.String)}{String.lastIndexOf} has the annotation
\refqualclasswithparams{checker/index/qual}{SubstringIndexFor}{value="this", offset="\#1.length()-1")}.
This allows writing code such as the following with no warnings from the
Index Checker:

\begin{Verbatim}
  public static String removeSubstring(String original, String removed) {
    int i = original.indexOf(removed);
    if (i != -1) {
      return original.substring(0, i) + original.substring(i + removed.length());
    }
    return original;
  }
\end{Verbatim}

% The code removes the first occurrence of \<removed> from
% \<original>. After checking that \code{i != -1}, the value of \<i> must
% be a valid index for \<original>. Because this index is the start of an
% occurrence of \<removed>, \code{i + removed.length()} is the index of the
% end of the occurrence.  Without the \<@SubstringIndexFor> annotation, the
% Upper Bound Checker would not be able to verify that \code{i +
% removed.length()} is a valid argument to \<substring>, which requires
% both arguments to be \<@IndexOrHigh("original")>.

\begin{figure}
\begin{center}
  \hfill
  \includeimage{substringindex}{3.5cm}
  \hfill
\end{center}
  \caption{The type hierarchy for the Substring Index Checker, which
    captures information about the results of calls to
    \sunjavadoc{java.base/java/lang/String.html\#indexOf(java.lang.String)}{String.indexOf}
    and
    \sunjavadoc{java.base/java/lang/String.html\#lastIndexOf(java.lang.String)}{String.lastIndexOf}.}
  \label{fig-index-substringindex}
\end{figure}

The \<@SubstringIndexFor> annotation is implemented in a Substring Index
Checker that runs together with the Index Checker and has its own type
hierarchy (Figure~\ref{fig-index-substringindex}) with three type
qualifiers:
\begin{description}
\item[\refqualclasswithparams{checker/index/qual}{SubstringIndexFor}{String[] value, String[] offset}]
  An expression with this type represents an integer that could have been
  produced by calling
  \sunjavadoc{java.base/java/lang/String.html\#indexOf(java.lang.String)}{String.indexOf}:
  the annotated integer is either -1, or it is non-negative and is less
  than or equal to \<receiver.length - offset> (where the sequence
  \<receiver> and the offset \<offset> are corresponding elements of the
  annotation's arguments).
\item[\refqualclass{checker/index/qual}{SubstringIndexBottom}]
  This is the bottom type, and programmers should rarely need to write it.
\item[\refqualclass{checker/index/qual}{SubstringIndexUnknown}]
  No information is known about whether this integer is a substring index.
  This is the top type, and programmers should rarely need to write it.
\end{description}


\subsectionAndLabel{The need for the \<@SubstringIndexFor> annotation}{index-substringindex-justification}

No other annotation supported by the Index Checker precisely represents the
possible return values of methods
\sunjavadoc{java.base/java/lang/String.html\#indexOf(java.lang.String)}{String.indexOf}
and
\sunjavadoc{java.base/java/lang/String.html\#lastIndexOf(java.lang.String)}{String.lastIndexOf}.
The reason is the methods' special cases for empty strings and for failed matches.

Consider the result \<i> of \<receiver.indexOf(substring)>:

\begin{itemize}
\item
  \<i> is \<@GTENegativeOne>, because \code{i >= -1}.
\item
  \<i> is \<@LTEqLengthOf("receiver")>, because \code{i <= receiver.length()}.
\item
  \<i> is not \<@IndexOrLow("receiver")>, because for
  \code{receiver = "", substring = "", i = 0}, the property
  \code{i >= -1 \&\& i < receiver.length()} does not hold.
\item
  \<i> is not \<@IndexOrHigh("receiver")>, because for
  \code{receiver = "", substring = "b", i = -1}, the property
  \code{i >= 0 \&\& i <= receiver.length()} does not hold.
\item
  \<i> is not
  \<@LTLengthOf(value = "receiver", offset = "substring.length()-1")>,
  because for \code{receiver = "", substring = "abc", i = -1}, the property
  \code{i + substring.length() - 1 < receiver.length()} does not hold.
\end{itemize}

\noindent
The last annotation in the list above,
\<@LTLengthOf(value = "receiver", offset = "substring.length()-1")>,
is the correct and precise upper bound for all values of \<i> except -1.
The offset expresses the fact that we can add \<substring.length()> to this
index and still get a valid index for \<receiver>.  That is useful for
type-checking code that adds the length of the substring to the found
index, in order to obtain the rest of the string.  However, the upper bound
applies only after the index is explicitly checked not to be -1:

\begin{Verbatim}
  int i = receiver.indexOf(substring);
  // i is @GTENegativeOne and @LTEqLengthOf("receiver")
  // i is not @LTLengthOf(value = "receiver", offset = "substring.length()-1")
  if (i != -1) {
    // i is @NonNegative and @LTLengthOf(value = "receiver", offset = "substring.length()-1")
    int j = i + substring.length();
    // j is @IndexOrHigh("receiver")
    return receiver.substring(j); // this call is safe
  }
\end{Verbatim}

The property of the result of \<indexOf> cannot be expressed by any
combination of lower-bound (Section~\ref{index-lowerbound}) and upper-bound
(Section~\ref{index-upperbound}) annotations, because the upper-bound
annotations apply independently of the lower-bound annotations, but in this
case, the upper bound \code{i <= receiver.length() - substring.length()}
holds only if \code{i >= 0}.  Therefore, to express this property and make
the example type-check without false positives, a new annotation such as
\<@SubstringIndexFor\allowbreak(value = "receiver", offset = "substring.length()-1")>
is necessary.

\sectionAndLabel{Inequalities}{index-inequalities}

The Index Checker estimates which expression's values are less than other expressions' values.

\begin{description}

\item[\refqualclasswithparams{checker/index/qual}{LessThan}{String[] values}]
  An expression with this type has a value that is less than the value of each
  expression listed in \<values>. The expressions in values must be composed of
  final or effectively final variables and constants.

\item[\refqualclass{checker/index/qual}{LessThanUnknown}]
  There is no information about the value of an expression this type relative to other expressions.
  This is the top type, and should not be written by the programmer.

 \item[\refqualclass{checker/index/qual}{LessThanBottom}]
   This is the bottom type for the less than type system. It should
   never need to be written by the programmer.

\end{description}

\sectionAndLabel{Annotating your own fixed-size datatypes}{index-annotating-fixed-size}

The Index Checker has built-in support for Strings and arrays.
You can add support for additional fixed-size data structures by writing
annotations.
This allows the Index Checker to typecheck the data structure's
implementation and to typecheck uses of the class.

This section gives an example:  a fixed-length collection.

%% The code that follows is copied from checker/tests/index/ArrayWrapper.java.
%% If this code is updated, please update that file, too.

\begin{Verbatim}
/** ArrayWrapper is a fixed-size generic collection. */
public class ArrayWrapper<T> {
    private final Object @SameLen("this") [] delegate;

    @SuppressWarnings("index") // constructor creates object of size @SameLen(this) by definition
    ArrayWrapper(@NonNegative int size) {
        delegate = new Object[size];
    }

    public @LengthOf("this") int size() {
        return delegate.length;
    }

    public void set(@IndexFor("this") int index, T obj) {
        delegate[index] = obj;
    }

    @SuppressWarnings("unchecked") // required for normal Java compilation due to unchecked cast
    public T get(@IndexFor("this") int index) {
        return (T) delegate[index];
    }
}
\end{Verbatim}

The Index Checker treats methods annotated with \code{@LengthOf("this")}  as
the length of a sequence like \code{arr.length} for arrays and
\code{str.length()} for strings.

With these annotations, client code like the following typechecks with no
warnings:
\begin{Verbatim}
  public static void clearIndex1(ArrayWrapper<? extends Object> a, @IndexFor("#1") int i) {
    a.set(i, null);
  }

  public static void clearIndex2(ArrayWrapper<? extends Object> a, int i) {
    if (0 <= i && i < a.size()) {
      a.set(i, null);
    }
  }
\end{Verbatim}


\iffalse % Remove "iffalse" when the Mutable Index Checker is publicized.
\sectionAndLabel{The Grow-only Checker: Verifying non-shrinking lists}{growonly-checker}

The Grow-only Checker permits a programmer to specify that certain
collections do not shrink in size.  If a collection is
\refqualclass{checker/index/qual}{GrowOnly}, then the Grow-only Checker
issues a warning if a method like \<remove()> or \<clear()> is called
on the collection.

\begin{Verbatim}
  List<String> lst;
  @GrowOnly List<String> lstGO;

  lst.removeFirst();   // OK
  lstGO.removeFirst(); // error

  lst.add("hello");    // OK
  lstGO.add("hello");  // OK
\end{Verbatim}


Programmers do not run the Grow-only Checker directly.  It is run as part of
the Index Checker, and it improves Index Checker output.  Suppose that integer
\<i> is an index for list \<myList>.  If a method with possible side effects is
called, then \<i> might no longer be a valid index for \<myList>.

\begin{Verbatim}
  class C {
    @NonEmpty List<String> myList;

  void m() {
    int i = myList.size()-1;
    myList.get(i);  // OK
    hasSideEffect();
    myList.get(i);  // compile error: `hasSideEffect` might have removed an element from myList
  }
\end{Verbatim}

In the above code, if \<myList> were
\refqualclass{checker/index/qual}{GrowOnly}, then the last statement would
type-check.  Method \<hasSideEffect> might make \<myList> longer or keep it the
same size, but it cannot invalidate the fact that \<i> is a valid index for
\<myList>.

\begin{figure}
\begin{center}
\includeimage{index-mutability}{5cm}
\end{center}
\caption{Qualifier hierarchy for the Grow-only Checker.}
\label{growonly-mutability-lattice}
\end{figure}

The type qualifiers of Figure~\ref{growonly-mutability-lattice} indicate whether
a sequence might shrink.

\begin{description}
\item[\refqualclass{checker/index/qual}{UnshrinkableRef}]
  The sequence cannot be shrunk through this reference. However, the sequence
  might still shrink if another alias to it is used to perform a shrinking
  operation. This is the default qualifier.  It is the type of the receiver of
  methods like \<List.add()>.

\item[\refqualclass{checker/index/qual}{GrowOnly}]
  The collection will never shrink as long as
  this reference exists. The checker ensures that no shrinking methods can be
  called through this reference, and the type system guarantees that no other
  alias can be used to shrink the collection either.  A grow-only list can be
  created like this:
\begin{Verbatim}
@GrowOnly List<String> list = new @GrowOnly ArrayList<>();
\end{Verbatim}

\item[\refqualclass{checker/index/qual}{Shrinkable}]
  The length of the sequence may be increased or decreased, via methods such as
  \<add()>, \<remove()>, and \<clear()>.

\item[\refqualclass{checker/index/qual}{UncheckedShrinkable}]
  For the purposes of the Grow-Only Checker, this is treated exactly like
  \<@Shrinkable>. It indicates that the collection can shrink. (Its special
  properties related to opting out of index checking are not relevant to the
  GrowOnly Checker).

\item[\refqualclass{checker/index/qual}{PolyGrowShrink}]
  is the polymorphic annotation for the Must Call type system.
  For a description of polymorphic qualifiers, see
  Section~\ref{method-qualifier-polymorphism}.

\end{description}
\fi % iffalse

\sectionAndLabel{Technical papers}{index-papers}

The paper ``Lightweight Verification of Array Indexing'' (ISSTA 2018,
\myurl{https://homes.cs.washington.edu/~mernst/pubs/array-indexing-issta2018-abstract.html})
gives more details about the Index Checker.
``Enforcing correct array indexes with a type system''~\cite{Santino2016} (FSE 2016) describes
an earlier version.


%%  LocalWords:  NegativeArraySizeException pre myArray IndexFor someArray
%%  LocalWords:  MyJavaFile LTLengthOf LTEqLengthOf GTENegativeOne GTE str
%%  LocalWords:  LowerBoundUnknown LTOMLengthOf LTEq LTOM UpperBoundBottom
%%  LocalWords:  UpperBoundUnknown MinLen MinLenBottom SameLen indexOf abc
%%  LocalWords:  SameLenUnknown SameLenBottom lastIndexOf html lang charAt
%%  LocalWords:  LengthOf IndexOrLow PolyIndex PolyLowerBound PolyLength
%%  LocalWords:  PolyUpperBound PolySameLen PolyValue LowerBoundBottom
%%  LocalWords:  lowerbound upperbound EnsuresLTLengthOf targetValue
%%  LocalWords:  EnsuresLTLengthOfIf boolean HasSubsequence LessThan
%%  LocalWords:  minlen ArrayLenRange ArrayLen EnsuresMinLenIf samelen
%%  LocalWords:  searchindex binarySearch SearchIndexFor NegativeIndexFor
%%  LocalWords:  SearchIndex bitwise SearchIndexBottom SearchIndexUnknown
%%  LocalWords:  Substring substringindex substring SubstringIndexFor
%%  LocalWords:  SubstringIndexBottom SubstringIndexUnknown LessThanBottom
%%  LocalWords:  LessThanUnknown typecheck typechecks system'' GrowOnly
%%  LocalWords:  UpperBoundLiteral UnshrinkableRef UncheckedShrinkable
